Abstract

We discuss projection on the intersection of a polyhedral convex cone and a sphere, 
in particular when the target is in the polar cone. We also discuss projection on the 
double cone formed by the cone and its negative. 
Note: This is a working paper which will be expanded/updated frequently. All suggestions
for improvement are welcome. The directory deleeuwpdx.net/pubfolders/dcone has a pdf
version, the bib file, and the complete Rmd file.

1 Introduction 

Suppose $\mathcal{E}$ is a finite-dimensional Euclidean space with inner product $\langle \cdot, \cdot \rangle$ and corresponding
norm $\| \cdot \|$. Also $\mathcal{S}$ is the unit sphere in $\mathcal{E}$, and $\mathcal{B}$ is the unit ball.

The Gifi framework for descriptive multivariate analysis (Gifi (1990), Michailidis and De Leeuw
(1998), De Leeuw and Mair (2009)) covers both linear and bilinear nonmetric multivariate
methods. In linear nonmetric techniques such as nonmetric regression, discriminant analysis,
and additivity analysis we have to solve the problem of minimizing $\sigma(y, x) = \|x - y\|^2$ over
$y \in \mathcal{L}$ and over $x \in \mathcal{K} \cap \mathcal{S}$, where $\mathcal{K} \subseteq \mathcal{E}$ is a closed and pointed polyhedral convex cone and
$\mathcal{L} \subseteq \mathcal{E}$ is a linear subspace. In bilinear nonmetric methods such as principal component and
canonical analysis this minimization problem has to be solved many times.

The standard way to solve this minimization problem is to use alternating least squares. We
alternate finding the optimal $y$ for given $x$ and the optimal $x$ for given $y$ until convergence.
The optimal $y$ for given $x$ is just a simple linear least squares problem. The optimal $x$ for
given $y$ is a normalized cone regression problem (De Leeuw (1975)). It can usually be solved
by first projecting $y$ on the cone $\mathcal{K}$ and then normalizing the projection to unit length.

It has typically not been emphasized in the data analysis literature that normalized cone 
regression can go astray, especially in early iterations and in situations where the least squares 
fit is very bad. In these, admittedly rare, cases the projection of $y$ on the cone $\mathcal{K}$ is the zero
vector, and we cannot normalize a zero vector. 
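
The alternating scheme and its failure mode can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions that are not in the text: the subspace is the line spanned by a single vector `a`, the cone is the non-negative orthant (so projection is clipping at zero), and the names `dot` and `als` are hypothetical.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def als(a, x, iters=50):
    """Alternate y-updates (projection on span{a}) and x-updates (orthant, then unit length)."""
    for _ in range(iters):
        # Optimal y for given x: least squares projection of x on the line spanned by a.
        c = dot(a, x) / dot(a, a)
        y = [c * v for v in a]
        # Optimal x for given y: project y on the orthant, then normalize.
        p = [max(v, 0.0) for v in y]
        norm = math.sqrt(dot(p, p))
        if norm == 0.0:
            # This is the situation described above: nothing to normalize.
            raise ValueError("projection on the cone is zero; cannot normalize")
        x = [v / norm for v in p]
    loss = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return x, y, loss

x, y, loss = als([1.0, 2.0, -1.0], [1.0, 0.0, 0.0])
```

Starting instead from `als([1.0, -1.0], [1.0, 1.0])` makes the first `y` the zero vector, so the sketch raises the `ValueError` rather than dividing by zero.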

2 History 

Gifi (1990) briefly discusses the problem on pages 529-530, which happen to be the very last
pages of the book, right before the references and the index. We give the relevant quotation,
for those who do not want to spend the $330.95 to buy the book.

In the book we sometimes use normalized cone regression. This can be either
one of two things. In the first place minimization of $(y - x)'W(y - x)/x'Wx$
over $x$ in a cone $C$. This is called implicit normalization. This name suggests
there is also something like explicit normalization. This is minimization of
$(y - x)'W(y - x)$ over all $x \in C$ that satisfy in addition $x'Wx = 1$. It is a basic result
of Kruskal and Carroll (1969) that in this simple case implicit normalization,
explicit normalization, and no normalization all give essentially the same solution.

All solutions are proportional to the projection of $y$ on the cone, with only the
proportionality constant different for the different problems. The result does not
rely on convexity. There is one exception which should be noted. If $y$ is in $C^\circ$,
i.e. $y'Wz \leq 0$ for all $z \in C$, then the origin is the projection of $y$ on $C$. In the
normalized problems the solution of $x$ is one of the extreme rays of $C$, suitably
normalized. An extreme ray is any ray in the cone that cannot be written as a
nonnegative linear combination of two other rays in the cone. If $C$ is a subspace
and $y$ is in the $W$-orthogonal complement, then the infimum in the implicitly
normalized problem is one - not attained, but approached by letting $x \to \infty$.

The infimum in the explicitly normalized problem is attained for any $x \in C$ with
$x'Wx = 1$; it is equal to $1 + y'Wy$. If $C$ is the cone used in monotone regression,
then the extreme rays are the vectors with the first $k$ elements equal to zero and
the last $n - k$ elements equal to one $(k = 1, \cdots, n - 1)$ together with the vectors
$u$ and $-u$ which span the intersection of $C$ and $C^\circ$. Thus the cone is not pointed.

The normalized regression problem must be solved by testing all these rays and 
by keeping the best one. 

Example 5.5.2 in Lange (2016) discusses the special case in which K is the non-negative 
orthant. On page 142 we see 

The constraint set $C$ in question is the intersection of the unit sphere and the
nonnegative orthant. Projection of an external point $y$ onto $C$ splits into several
cases. When all components of $y$ are negative, $P_C(y) = e_i$, where $y_i$ is the least
negative component of $y$, and $e_i$ is the standard unit vector along coordinate
direction $i$. The origin $\mathbf{0}$ is equidistant from all points of $C$. If any component of
$y$ is positive then the projection is constructed by setting the negative components
of $y$ equal to 0 and standardizing the truncated version of $y$ to have norm 1.
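
The rule in this quotation is simple enough to write down directly. The following Python sketch is an illustration, not code from Lange (2016): the function name `project_orthant_sphere` is hypothetical, and the treatment of ties and of zero components (take the first largest component) is an assumption, since the quotation only covers the strictly negative and the partly positive case.

```python
import math

def project_orthant_sphere(y):
    """Return one nearest point to y in the set (non-negative orthant) ∩ (unit sphere)."""
    clipped = [max(v, 0.0) for v in y]
    norm = math.sqrt(sum(v * v for v in clipped))
    if norm > 0.0:
        # Some component is positive: truncate the negatives and rescale to norm 1.
        return [v / norm for v in clipped]
    # No positive component: use the unit vector along the largest
    # (least negative) component; ties go to the first such index.
    i = max(range(len(y)), key=lambda k: y[k])
    return [1.0 if k == i else 0.0 for k in range(len(y))]

mixed = project_orthant_sphere([3.0, -1.0, 4.0])
negative = project_orthant_sphere([-1.0, -2.0])  # [1.0, 0.0]
```

When `y` is the zero vector every point of the set is equally close, and the sketch simply returns the first unit vector.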

Neither Gifi nor Lange provides an actual proof for their statements. Exercise 5.6.14 in Lange
(2016) asks the reader to come up with a proof. All the machinery and proofs one can possibly 
want were recently made available in Bauschke, Bui, and Wang (2018). That article provides 
a very thorough and complete analysis of projection on the intersection of a cone and a sphere 
(or a ball). On page 2158 the authors say 

Inspired by an example in the recent and charming book [12], our aim in this 
paper is to systematically study the case in which K is a closed convex cone and S 
is either the (convex) unit ball or (nonconvex) unit sphere centered at the origin. 

The “recent and charming” book in the quotation is, of course, Lange (2016). The authors 
do not refer to Gifi, which is probably because Gifi's book is neither recent nor charming.


3 Unnormalized Cone Projection 

We review some basic facts about unnormalized cone projection problems. Any book on 
convex analysis (for example Rockafellar (1970)) will provide the necessary discussion and 
proofs. 

Definition 1: [Projection] The projection $P_A(x)$ of $x$ on a set $A \subseteq \mathcal{E}$ is the set of all $y \in A$
such that $\|x - y\| = \min_{z \in A} \|x - z\|$. Projections need not be singletons and can be empty.

Definition 2: [Polar Cone] The polar cone $\mathcal{K}^\circ$ of a cone $\mathcal{K}$ is the set of all $y$ such that
$\langle x, y \rangle \leq 0$ for all $x \in \mathcal{K}$.

Result 1: [Cone Projection] 

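1. If $\mathcal{K}$ is convex the projection is a singleton. In that case we also use $P_{\mathcal{K}}(x)$ for the
unique element of the projection.

2. $y = P_{\mathcal{K}}(x)$ if and only if $y \in \mathcal{K}$, $\langle y, x - y \rangle = 0$, and $\langle z, x - y \rangle \leq 0$ for all $z \in \mathcal{K}$. Thus $y = P_{\mathcal{K}}(x)$
if and only if $x - y = P_{\mathcal{K}^\circ}(x)$.

3. $x = P_{\mathcal{K}}(x) + P_{\mathcal{K}^\circ}(x)$ or $P_{\mathcal{K}^\circ}(x) = x - P_{\mathcal{K}}(x)$.

4. $\|x\|^2 = \|P_{\mathcal{K}}(x)\|^2 + \|P_{\mathcal{K}^\circ}(x)\|^2$.

5. The projection $P_{\mathcal{K}}(x)$ of $x$ on $\mathcal{K}$ is zero if and only if $x \in \mathcal{K}^\circ$.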
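
A small numerical check may help to fix ideas. The Python sketch below assumes, purely for illustration, that the cone is the non-negative orthant, whose polar cone is the non-positive orthant; the function names are hypothetical. It verifies the decomposition of $x$ into the two projections, their orthogonality, and the Pythagorean identity for one vector.

```python
def project_cone(x):
    """Projection on the non-negative orthant K (assumed example cone)."""
    return [max(v, 0.0) for v in x]

def project_polar(x):
    """Projection on the polar cone of K, the non-positive orthant."""
    return [min(v, 0.0) for v in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = [2.0, -1.0, 0.5, -3.0]
p, q = project_cone(x), project_polar(x)

# x is the sum of its projections on the cone and on the polar cone.
decomposition_holds = all(pi + qi == xi for pi, qi, xi in zip(p, q, x))
# The two projections are orthogonal.
orthogonal = dot(p, q) == 0.0
# Squared norms add up.
pythagoras = abs(dot(x, x) - dot(p, p) - dot(q, q)) < 1e-12
```

For a vector with no positive entries `project_cone` returns the zero vector, in line with item 5.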

Here are two more definitions for later use.

Definition 3: [Negative of a Cone] The negative $\mathcal{K}^-$ of a cone $\mathcal{K}$ is the set of all $y$ such
that $y = -x$ for some $x \in \mathcal{K}$.

Definition 4: [Double of a Cone] The double $\mathcal{K}^\star$ of a cone $\mathcal{K}$ is the double cone $\mathcal{K} \cup \mathcal{K}^-$.

In the figure below the cone is red, the polar is the union of green and blue, the negative is
blue, and the double is the union of red and blue.



4 Normalized Cone Regression 

We give a brief summary of the results in De Leeuw (1975) with somewhat different proofs. We 
discuss the relations between unnormalized, explicitly normalized, and implicitly normalized 
cone regression. Similar results were first discussed, somewhat informally, in Kruskal and 
Carroll (1969). 

4.1 Target not in Polar Cone 

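Theorem 1: [Unnormalized]

1. If $y \in \mathcal{K}^\circ$ then $\min_{x \in \mathcal{K}} \|x - y\|^2 = \|y\|^2$, and the minimum is attained for $x = 0$.

2. If $y \in \mathcal{E} \setminus \mathcal{K}^\circ$ then $\operatorname{argmin}_{x \in \mathcal{K}} \|x - y\|^2 = \langle \hat{x}, y \rangle \hat{x}$, with $\hat{x} \in \operatorname{argmax}_{x \in \mathcal{K} \cap \mathcal{S}} \langle x, y \rangle$.

Proof: All we need is the following simple decomposition of the problem.

$$\min_{x \in \mathcal{K}} \|x - y\|^2 = \min_{x \in \mathcal{K} \cap \mathcal{S}} \min_{\alpha \geq 0} \|\alpha x - y\|^2 = \|y\|^2 - \max_{x \in \mathcal{K} \cap \mathcal{S}} \left[\max(0, \langle x, y \rangle)\right]^2.$$

If $y \in \mathcal{K}^\circ$ then $\max(0, \langle x, y \rangle) = 0$ for all $x \in \mathcal{K}$. ■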
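
For readers who want the inner minimization over the scale factor spelled out, here is the routine calculation behind the decomposition. It is a standard completion of the square, written in this note's notation with $\|x\| = 1$, and is not taken from any of the cited references.

```latex
% Fix x on the unit sphere, so that \|x\| = 1, and let \alpha \geq 0.
\|\alpha x - y\|^2
  = \alpha^2 - 2\alpha \langle x, y \rangle + \|y\|^2
  = (\alpha - \langle x, y \rangle)^2 + \|y\|^2 - \langle x, y \rangle^2 .
% The minimizer over \alpha \geq 0 is the unconstrained minimizer clipped at zero:
\hat{\alpha} = \max(0, \langle x, y \rangle),
\qquad
\min_{\alpha \geq 0} \|\alpha x - y\|^2
  = \|y\|^2 - \left[\max(0, \langle x, y \rangle)\right]^2 .
```

Minimizing the right-hand side over $x$ in the intersection of the cone and the sphere then amounts to maximizing $\max(0, \langle x, y \rangle)$ over that set.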

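Corollary 1: [Non-Polar Target]

If $y \in \mathcal{E} \setminus \mathcal{K}^\circ$ then

$$P_{\mathcal{K} \cap \mathcal{S}}(y) = \frac{P_{\mathcal{K}}(y)}{\|P_{\mathcal{K}}(y)\|}.$$

Proof: By theorem 1 we have $P_{\mathcal{K}}(y) = \langle \hat{x}, y \rangle \hat{x}$, where $\hat{x} \in P_{\mathcal{K} \cap \mathcal{S}}(y)$. Thus $\|P_{\mathcal{K}}(y)\| = \langle \hat{x}, y \rangle$,
and the result follows. ■

Theorem 2: [Explicit Normalization]

$$\operatorname{argmin}_{x \in \mathcal{K} \cap \mathcal{S}} \|x - y\|^2 = \operatorname{argmax}_{x \in \mathcal{K} \cap \mathcal{S}} \langle x, y \rangle.$$

Proof: In this case

$$\min_{x \in \mathcal{K} \cap \mathcal{S}} \|x - y\|^2 = 1 + \|y\|^2 - 2 \max_{x \in \mathcal{K} \cap \mathcal{S}} \langle x, y \rangle.$$ ■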


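Theorem 3: [Implicit Normalization]

$$\operatorname{argmin}_{x \in \mathcal{K}} \frac{\|x - y\|^2}{\|x\|^2} \propto \operatorname{argmax}_{x \in \mathcal{K} \cap \mathcal{S}} \langle x, y \rangle.$$

Proof: In this case

$$\min_{x \in \mathcal{K}} \frac{\|x - y\|^2}{\|x\|^2} = \min_{x \in \mathcal{K} \cap \mathcal{S}} \min_{\alpha > 0} \frac{\alpha^2 + \|y\|^2 - 2\alpha \langle x, y \rangle}{\alpha^2} = \min_{x \in \mathcal{K} \cap \mathcal{S}} \min_{\beta > 0} \left\{ 1 + \beta^2 \|y\|^2 - 2\beta \langle x, y \rangle \right\} = 1 - \frac{1}{\|y\|^2} \max_{x \in \mathcal{K} \cap \mathcal{S}} \left[\max(0, \langle x, y \rangle)\right]^2,$$

where $\beta = 1/\alpha$. ■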


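Thus when $y \in \mathcal{E} \setminus \mathcal{K}^\circ$, for both normalized and unnormalized cone projection, it suffices to
compute

$$\operatorname{argmax}_{x \in \mathcal{K} \cap \mathcal{S}} \langle x, y \rangle = \operatorname{argmax}_{x \in \mathcal{K} \cap \mathcal{B}} \langle x, y \rangle,$$

i.e. to compute the support function of the compact convex set $\mathcal{K} \cap \mathcal{B}$ at $y$.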

4.2 Target in Polar Cone 

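Theorem 4: [Polar Target] Suppose $\mathcal{K}$ is the set of all non-negative linear combinations
of $m$ distinct unit-length vectors $z_j$. If $y \in \mathcal{K}^\circ$ then

$$\max_{x \in \mathcal{K} \cap \mathcal{S}} \langle y, x \rangle = \max_{j=1}^{m} \langle y, z_j \rangle.$$

Proof: We have $x \in \mathcal{K} \cap \mathcal{S}$ if and only if

$$x = \alpha(\lambda) \sum_{j=1}^{m} \lambda_j z_j,$$

where the $0 \leq \lambda_j \leq 1$, and the $\lambda_j$ add up to one, and where

$$\alpha(\lambda) = \frac{1}{\left\| \sum_{j=1}^{m} \lambda_j z_j \right\|}.$$

Because the norm is strictly convex we have

$$\left\| \sum_{j=1}^{m} \lambda_j z_j \right\| \leq \sum_{j=1}^{m} \lambda_j \|z_j\| = 1, \qquad (2)$$

and thus $\alpha(\lambda) \geq 1$ with equality if and only if exactly one of the $\lambda_j$ is equal to one (and the
others are zero).

Because $y \in \mathcal{K}^\circ$ in addition

$$\sum_{j=1}^{m} \lambda_j \langle y, z_j \rangle \leq \max_{j} \langle y, z_j \rangle \leq 0. \qquad (3)$$

Thus, from (2) and (3),

$$\langle y, x \rangle = \alpha(\lambda) \sum_{j=1}^{m} \lambda_j \langle y, z_j \rangle \leq \max_{j} \langle y, z_j \rangle.$$ ■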
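
The statement of theorem 4 can be probed numerically. The Python sketch below is a sanity check under assumed data, not part of the proof: the three generators in `Z` and the target `y` are made up for illustration, and random sampling of the cone can only fail to find a counterexample, it cannot establish the result.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

# Hypothetical unit-length generators of a pointed polyhedral cone in R^3.
Z = [unit([1.0, 0.0, 0.0]), unit([0.0, 1.0, 0.0]), unit([1.0, 1.0, 1.0])]
y = [-1.0, -2.0, -0.5]
in_polar = all(dot(y, z) <= 0.0 for z in Z)  # y is in the polar cone

# Right-hand side of the theorem: best inner product over the generators.
best_generator = max(dot(y, z) for z in Z)

# Left-hand side, approximated by sampling unit vectors in the cone.
random.seed(0)
best_sampled = -math.inf
for _ in range(10000):
    w = [random.random() for _ in Z]
    x = unit([sum(wj * z[i] for wj, z in zip(w, Z)) for i in range(3)])
    best_sampled = max(best_sampled, dot(y, x))
```

In this example no sampled point does better than the best generator, which is the first coordinate axis.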


5 Double Cone Regression 

If we come across an instance, in our alternating least squares iterative processes, where the
target is in the polar cone it may be too much work to test all extreme rays of the cone to
find the projection on $\mathcal{K} \cap \mathcal{S}$. There is, however, a convenient way around the problem.

Remember that in our algorithms we alternate minimization of $\|x - y\|^2$ over $y \in \mathcal{L}$ for
fixed $x$ and over $x \in \mathcal{K} \cap \mathcal{S}$ for fixed $y$. Thus we are in trouble if in some iteration we have
$y \in \mathcal{L} \cap \mathcal{K}^\circ$. The ad hoc solution, which is basically what is used in some of the Gifi programs,
is to compute $P_{\mathcal{K} \cap \mathcal{S}}(-y)$ and then replace $y$ by $-y$. This essentially means we project $y$ on
the intersection of $\mathcal{S}$ and the double cone $\mathcal{K}^\star = \mathcal{K} \cup \mathcal{K}^-$.

Theorem 5: [Negative]

$$P_{\mathcal{K}^-}(x) = -P_{\mathcal{K}}(-x)$$

Proof: By result 1 we have $y = P_{\mathcal{K}^-}(x)$ if and only if $y \in \mathcal{K}^-$, $\langle y, x - y \rangle = 0$, and $\langle z, x - y \rangle \leq 0$ for all $z \in \mathcal{K}^-$.
Writing $z = -w$ with $w \in \mathcal{K}$, these conditions say that $-y \in \mathcal{K}$, $\langle -y, -x - (-y) \rangle = 0$, and
$\langle w, -x - (-y) \rangle \leq 0$ for all $w \in \mathcal{K}$, i.e. $-y = P_{\mathcal{K}}(-x)$. ■

In an actual implementation we can follow two strategies. The first is to compute $P_{\mathcal{K}}(y)$ and
$P_{\mathcal{K}^-}(y) = -P_{\mathcal{K}}(-y)$ and normalize the one with the best fit. The second strategy, which is
slightly more economical, is to compute $P_{\mathcal{K}}(y)$. If this is nonzero, we normalize it. If it is
zero, we compute $P_{\mathcal{K}}(-y)$ and normalize it (and replace $y$ by $-y$).
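
The second strategy can be sketched as follows. This Python fragment is an illustration only: it assumes the cone is the non-negative orthant so that the projection is a simple clipping, the names `project_orthant`, `norm` and `double_cone_update` are hypothetical, and it is not the code used in the Gifi programs.

```python
import math

def project_orthant(y):
    """Projection on the non-negative orthant (assumed example cone)."""
    return [max(v, 0.0) for v in y]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def double_cone_update(y):
    """Return (x, y_used): a unit vector x in the cone, and y or its negative."""
    p = project_orthant(y)
    if norm(p) == 0.0:
        # y is in the polar cone: switch to -y and project that instead.
        y = [-v for v in y]
        p = project_orthant(y)
    n = norm(p)
    if n == 0.0:
        raise ValueError("y is the zero vector; every unit vector fits equally well")
    return [v / n for v in p], y

x1, y1 = double_cone_update([3.0, -4.0])   # projection is nonzero, y is kept
x2, y2 = double_cone_update([-3.0, -4.0])  # y is in the polar cone, so y is negated
```

The second call returns a unit vector in the cone together with the negated target, which is the sign change described in the text.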